Teaching with Rewards and Punishments: Reinforcement or Communication?
Authors
Abstract
Teaching with evaluative feedback involves expectations about how a learner will interpret rewards and punishments. We formalize two hypotheses of how a teacher implicitly expects a learner to interpret feedback – a reward-maximizing model based on standard reinforcement learning and an action-feedback model based on research on communicative intent – and describe a virtual animal-training task that distinguishes the two. The results of two experiments in which people gave learners feedback for isolated actions (Exp. 1) or while learning over time (Exp. 2) support the action-feedback model over the reward-maximizing model.
Similar Resources
Reward Processing: a Global Brain Phenomenon?
Rewards and punishments (reinforcement) powerfully shape behavior. Accordingly, their neuronal representation is of significant interest, both for understanding normal brain-behavior relationships and the pathophysiology of disorders such as depression and addiction. A recent article by Vickery and colleagues in Neuron provides evidence that the neural response to rewards and ...
Reward processing: a global brain phenomenon?
Rewards and punishments (reinforcement) powerfully shape behavior. Accordingly, their neuronal representation is of significant interest, both for understanding normal brain-behavior relationships and the pathophysiology of disorders such as depression and addiction. A recent article by Vickery and colleagues (Neuron 72: 166-177, 2011) provides evidence that the neural response to rewards and p...
Neuro Forum Reward processing: a global brain phenomenon?
Clark AM. Reward processing: a global brain phenomenon? J Neurophysiol 109: 1–4, 2013. First published July 18, 2012; doi:10.1152/jn.00070.2012.—Rewards and punishments (reinforcement) powerfully shape behavior. Accordingly, their neuronal representation is of significant interest, both for understanding normal brain-behavior relationships and the pathophysiology of disorders such as depression...
Temporal difference learning is favored for rewards, but not punishments, in simulations and human behavior
Evidence indicates that dopaminergic neurons in basal ganglia implement a form of temporal difference (TD) reinforcement learning. Yet, while phasic dopamine levels encode prediction errors of rewarding outcomes, the encoding of punishing outcomes is weaker and less precise. We posit that this asymmetry between reward and punishment reflects functional design. In order to test this hypothesis, ...
Reinforcement learning: the good, the bad and the ugly.
Reinforcement learning provides both qualitative and quantitative frameworks for understanding and modeling adaptive decision-making in the face of rewards and punishments. Here we review the latest dispatches from the forefront of this field, and map out some of the territories where lie monsters.